System and method for handling floating point hardware exception

ABSTRACT

A method includes receiving a first input data and a second input data at a floating point arithmetic operating unit, wherein the first input data and the second input data are associated with operands of a floating point arithmetic operation respectively, wherein the floating point operating unit is configured to perform a floating point arithmetic operation on the first input data and the second input data. The method further includes determining whether the first input data is a QNAN (quiet not-a-number) or whether the first input data is an SNAN (signaling not-a-number) prior to performing the floating point arithmetic operation. A value of the first input data is modified prior to performing the floating point arithmetic operation if the first input data is either a QNAN or an SNAN, wherein the modifying eliminates special handling associated with the floating point arithmetic operation on the first input data being either a QNAN or an SNAN.

RELATED APPLICATIONS

The present application is a Continuation Application that claims the benefit and priority to the Nonprovisional U.S. application Ser. No. 16/864,069 that was filed on Apr. 30, 2020, which claims the benefit and priority to the Provisional U.S. Application No. 62/950,626 that was filed on Dec. 19, 2019, which are incorporated herein by reference in their entirety.

BACKGROUND

Machine learning (ML) systems are generally computationally intensive and generally perform large amounts of floating point (FP) operations. FP arithmetic operators for the FP operations are generally compliant with the IEEE-754 standard. FP hardware exceptions are generated when input to and/or output from a FP arithmetic operator is one of positive infinity, negative infinity, a signaling not-a-number (SNAN), etc. Large amounts of resources are typically needed for handling large numbers of FP hardware exceptions being generated from large numbers of FP operations in the ML systems. Moreover, additional resources are needed to handle circumstances where the input to or output from the FP arithmetic operators is a denormal number or when the input is a quiet not-a-number (QNAN), SNAN, infinity, etc. A denormal number refers to a non-zero number in floating point arithmetic whose magnitude is smaller than the smallest normal number.

Currently, additional data paths are often needed for each FP arithmetic operator to handle the values of inputs or outputs with denormal numbers, QNANs, SNANs, infinities, etc. These additional data paths result in a larger footprint, higher power consumption, and increase in complexity of the ML systems.

The foregoing examples of the related art and limitations related therewith are intended to be illustrative and not exclusive. Other limitations of the related art will become apparent upon a reading of the specification and a study of the drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Aspects of the present disclosure are best understood from the following detailed description when read with the accompanying figures. It is noted that, in accordance with the standard practice in the industry, various features are not drawn to scale. In fact, the dimensions of the various features may be arbitrarily increased or reduced for clarity of discussion.

FIG. 1 depicts an illustrative example of an architecture configured to efficiently handle FP hardware exception according to one aspect of the present embodiments.

FIG. 2 shows an illustrative example of a programmable architecture configured to efficiently handle FP hardware exception according to one aspect of the present embodiments.

FIG. 3 shows an illustrative example of an architecture configured to efficiently handle FP hardware exception and tracking thereof according to one aspect of the present embodiments.

FIG. 4 shows an illustrative example of a method for efficiently handling FP hardware exception according to one aspect of the present embodiments.

FIG. 5 shows an illustrative example of another method for efficiently handling FP hardware exception according to one aspect of the present embodiments.

FIG. 6 shows an illustrative example of a block diagram depicting an example of computer system suitable for efficient handling of FP hardware exception according to one aspect of the present embodiments.

DETAILED DESCRIPTION

The following disclosure provides many different embodiments, or examples, for implementing different features of the subject matter. Specific examples of components and arrangements are described below to simplify the present disclosure. These are, of course, merely examples and are not intended to be limiting. In addition, the present disclosure may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.

Before various embodiments are described in greater detail, it should be understood that the embodiments are not limiting, as elements in such embodiments may vary. It should likewise be understood that a particular embodiment described and/or illustrated herein has elements which may be readily separated from the particular embodiment and optionally combined with any of several other embodiments or substituted for elements in any of several other embodiments described herein. It should also be understood that the terminology used herein is for the purpose of describing the certain concepts, and the terminology is not intended to be limiting. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood in the art to which the embodiments pertain.

According to some embodiments, input data to a FP arithmetic operator of a ML system is modified in order to avoid generating one or more FP hardware exceptions. For a non-limiting example, the value of the input data (i.e. operand) for the FP arithmetic operator is replaced with a maximum supported number or a minimum supported number of the system when the input data is a positive infinity or negative infinity respectively. Moreover, when an input data for a FP arithmetic operator is an SNAN, the input data may be replaced with a zero. As such, the input data as modified would not generate a FP hardware exception resulting from its original value, e.g., positive infinity, negative infinity, an SNAN, etc. Furthermore, the input data may be manipulated to handle other circumstances such as denormal numbers, QNAN inputs, etc., which may not cause a FP hardware exception but nonetheless may require an additional data path for each FP arithmetic operator to handle the circumstances. In some non-limiting examples, an input data being a denormal number or being a QNAN input may be replaced with zero. Accordingly, the need for additional data paths for each FP arithmetic operator to handle the FP hardware exception or to handle denormal numbers or a QNAN input is eliminated.

In some embodiments, the output of the FP arithmetic operator, e.g., addition, subtraction, add-reduce, multiplication, negation, maximum, minimum, max-reduce, min-reduce, division, FPx to FPy (where x>y), FPx to FPy (where x<y), FP to integer (Int), etc., is similarly monitored and replaced in order to avoid additional data paths for each FP arithmetic operator. For a non-limiting example, even if the input data into the FP arithmetic operator may not cause a FP hardware exception or a special circumstance handling, the output may nonetheless require special handling. As an illustrative example, two input operands may each be within the supported numerical range but when added together may be greater than the maximum supported number, thereby generating infinity and requiring a special handling or generating a FP hardware exception. Accordingly, the output of a FP operator may be replaced with a maximum supported number if the FP operator results in positive infinity. Similarly, the output of a FP operator may be replaced with a minimum supported number if the FP operator results in negative infinity.

It is appreciated that in some nonlimiting examples the number of FP hardware exceptions being generated is reduced. In some nonlimiting examples, the output of a FP operator may be replaced with zero if the output is a denormal number. It is appreciated that the discussion of the operations with respect to addition is merely for illustration purposes and should not be construed as limiting the scope of the embodiments. For example, a similar process may take place for other operations, such as subtraction, add-reduce, multiplication, negation, maximum, minimum, max-reduce, min-reduce, division, FPx to FPy (where x>y), FPx to FPy (where x<y), FP to Int, etc.

Accordingly, the need for additional data paths for each FP arithmetic operator to handle the FP hardware exceptions or to handle denormal, infinity, SNAN or QNAN input is eliminated. Thus, hardware footprint, power consumption, and complexity are reduced.

FIG. 1 depicts an illustrative example of an architecture configured to efficiently handle FP hardware exception according to one aspect of the present embodiments. In some embodiments, a memory 110 is coupled to a logic engine 120, which is coupled to a convertor engine 130, which is further coupled to an arithmetic logic unit (ALU) 140. The ALU 140 is a FP arithmetic operator/operating unit configured to perform one or more FP arithmetic operations, e.g., addition, subtraction, add-reduce, multiplication, negation, maximum, minimum, max-reduce, min-reduce, division, FPx to FPy (where x>y), FPx to FPy (where x<y), FP to Int, etc., for a ML operation. According to some embodiments, the memory 110 stores data, e.g., numerical data, non-numerical data, etc. In some embodiments, one or more operands for a FP arithmetic operation are stored in the memory 110. It is appreciated that the operand(s) for the FP arithmetic operation may be fetched and transmitted as an input data 112. The logic engine 120 receives the input data 112. The logic engine 120 is configured to parse the input data 112 to determine whether the input data 112, as received, would result in a generation of a FP hardware exception once operated on by the ALU 140. For a non-limiting example, the logic engine 120 is configured to determine whether the input data 112 is positive infinity, negative infinity, an SNAN, etc. In other words, the logic engine 120 is configured to determine whether a FP hardware exception would be generated before the FP arithmetic operator operates on the input data (i.e. determined a priori). Furthermore, the logic engine 120 may be configured to determine whether the input data 112 would require a special handling once it is operated on by the ALU 140, e.g., QNAN, denormal number, etc. It is appreciated that the logic engine 120 may be implemented in software according to one nonlimiting example. However, it is appreciated that the logic engine 120 may be implemented in hardware in some embodiments. As such, discussion of the embodiments with respect to software implementation is for illustrative purposes only and should not be construed as limiting the scope of the embodiments.
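As a minimal sketch of the kind of parsing the logic engine 120 may perform, the following Python example classifies a raw 32-bit pattern before any arithmetic is attempted. It assumes an IEEE-754 binary32 encoding in which the most significant fraction bit distinguishes a QNAN from an SNAN; the names FpClass and classify_fp32 are hypothetical and do not appear in the embodiments.

```python
from enum import Enum


class FpClass(Enum):
    """Hypothetical labels for the cases the logic engine distinguishes."""
    ZERO = "zero"
    DENORMAL = "denormal"
    NORMAL = "normal"
    POS_INF = "+inf"
    NEG_INF = "-inf"
    QNAN = "qnan"
    SNAN = "snan"


def classify_fp32(bits: int) -> FpClass:
    """Classify a raw binary32 bit pattern without operating on it."""
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF:
        if fraction == 0:
            return FpClass.NEG_INF if sign else FpClass.POS_INF
        # Assumption: the top fraction bit set marks a quiet NaN.
        return FpClass.QNAN if fraction & 0x400000 else FpClass.SNAN
    if exponent == 0:
        return FpClass.ZERO if fraction == 0 else FpClass.DENORMAL
    return FpClass.NORMAL
```

Working on the bit pattern rather than on a language-level float keeps the SNAN/QNAN distinction intact, since some software conversions silently quiet a signaling NaN.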

In some embodiments, the logic engine 120 may transmit the results of its determination 122 to the convertor engine 130. For a non-limiting example, the logic engine 120 may transmit whether the input data 112 would generate a FP hardware exception or whether the input data 112 would require a special handling once it is operated on by the FP arithmetic operator. It is appreciated that in some nonlimiting embodiments, the determination 122 may further include the input data 112. However, it is appreciated that the determination 122 including the input data 112 is for illustrative purposes and the convertor engine 130 may independently receive the input data 112, e.g., from the memory 110. The convertor engine 130 in response to the determination by the logic engine 120 may change the value or content of the input data 112. In some non-limiting examples, the input data 112 is changed to a maximum supported number or a minimum supported number of the system when the input data 112 is a positive infinity or negative infinity respectively. Moreover, when an input data 112 is an SNAN the input data may be replaced with a zero. As such, the input data 112 as modified by the convertor engine 130 would not generate a FP hardware exception resulting from its original value, e.g., positive infinity, negative infinity, an SNAN, etc. Furthermore, the input data 112 may be manipulated to handle other circumstances such as denormal numbers, QNAN inputs, etc., that may not generate a FP hardware exception but nonetheless may require additional data paths for each FP arithmetic operator to handle the circumstances. In some nonlimiting examples, the convertor engine 130 replaces the input data 112 with zero if the input data 112 is a denormal number or is a QNAN. It is appreciated that the convertor engine 130 may leave the input data 112 unaltered if the input data 112 is none of a QNAN, an SNAN, positive infinity, negative infinity, or a denormal number.
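The replacement rules described for the convertor engine 130 can be sketched as follows. This is an illustration only: it assumes the system format is IEEE-754 binary64 (Python's float), so that the maximum and minimum supported numbers are plus and minus sys.float_info.max, and the function name convert_input is hypothetical.

```python
import math
import sys

# Assumption: the "system" format is binary64, as used by Python floats.
MAX_SUPPORTED = sys.float_info.max
MIN_SUPPORTED = -sys.float_info.max
SMALLEST_NORMAL = sys.float_info.min


def convert_input(x: float) -> float:
    """Replace exception-prone operands before the ALU sees them."""
    if math.isnan(x):
        # QNAN and SNAN are both replaced with (signed) zero.
        return math.copysign(0.0, x)
    if math.isinf(x):
        return MAX_SUPPORTED if x > 0 else MIN_SUPPORTED
    if x != 0.0 and abs(x) < SMALLEST_NORMAL:
        # Denormal numbers are flushed to (signed) zero.
        return math.copysign(0.0, x)
    return x  # Any other value passes through unaltered.
```

For instance, convert_input(float("inf")) yields the largest finite value, while an ordinary operand such as 1.5 is returned unchanged.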

In some embodiments, the convertor engine 130 outputs data 132 to the ALU 140. It is appreciated that the data 132 may be the same as the input data 112 if it is unaltered or it may be an altered version of the input data 112, as altered by the convertor engine 130. In some embodiments, the ALU 140 is configured to perform a FP arithmetic operation on the received data 132. It is appreciated that no FP hardware exception is generated resulting from the input data 112 being positive infinity, negative infinity, or an SNAN and that no special handling is needed for input data 112 being either a QNAN or a denormal number because the input data 112 is changed to avoid generation of the FP hardware exception or the need for special handling. However, even though the input to the ALU 140 may be a valid number, the output may nonetheless generate an exception or need special handling. For a non-limiting example, two valid numbers may generate a denormal number when added to one another or result in positive or negative infinity when added to one another. Thus, the output of the ALU 140 is monitored for denormal numbers or positive or negative infinity. The output 142 of the ALU 140 is input to the logic engine 120 that is configured to determine whether the output 142 is a denormal number or whether it is positive or negative infinity. If the logic engine 120 determines that the output 142 is neither a denormal number nor positive or negative infinity, then the logic engine 120 outputs the content 124 without a need to change its value. In other words, the content 124 has the same value as the output 142 from the ALU 140. On the other hand, if the logic engine 120 determines that the output 142 is either a denormal number or positive infinity or negative infinity then the output 142 is transmitted as content 126 to the convertor engine 130 in order for the content to be modified. For example, if the content 126 is a denormal number then the convertor engine 130 changes the value to zero and outputs the changed value as output 134. In contrast, if the content 126 is positive or negative infinity then the convertor engine 130 changes the value to the maximum supported number or minimum supported number by the system and outputs it as content 134. It is appreciated that the process is repeated for each input data (i.e. operand) for a FP arithmetic operator and its output. It is further appreciated that the content 126 is passed from the logic engine 120 to the convertor engine 130 for illustrative purposes and the embodiments should not be construed as limiting the scope. For example, in some embodiments the convertor engine 130 may receive the content 126 directly from the logic engine 120. In one nonlimiting example, the convertor engine 130 may receive the data directly from the ALU 140.
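The output-side monitoring can be illustrated with a short sketch in which two valid operands overflow or underflow when added. The sketch assumes binary64 as the system format; convert_output and monitored_add are hypothetical names, and the boolean return value merely reports whether the raw result had to be changed.

```python
import math
import sys

# Assumption: binary64 is the system format.
MAX_SUPPORTED = sys.float_info.max
SMALLEST_NORMAL = sys.float_info.min


def convert_output(y: float) -> float:
    """Replace an infinite or denormal ALU result with a supported value."""
    if y == math.inf:
        return MAX_SUPPORTED
    if y == -math.inf:
        return -MAX_SUPPORTED
    if y != 0.0 and abs(y) < SMALLEST_NORMAL:
        return math.copysign(0.0, y)
    return y


def monitored_add(a: float, b: float):
    """Add two valid operands, then check the result as the logic engine would."""
    raw = a + b  # Python float addition yields inf on overflow, no exception.
    fixed = convert_output(raw)
    return fixed, fixed != raw
```

Adding the largest finite value to itself produces infinity in the raw result, which the sketch replaces with the largest finite value; subtracting the smallest normal number from 1.5 times itself produces a denormal, which is replaced with zero.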

For example, if the logic engine 120 determines that the output 142 is either a denormal number or positive infinity or negative infinity, then the ALU 140 may be signaled to send the output 142 from the ALU 140 to the convertor engine 130 in order for the content to be modified. It is appreciated that the process is repeated for each input data (i.e. operand) for a FP arithmetic operator and its output. It is appreciated that in some nonlimiting embodiments, the logic engine 120 and the convertor engine 130 may be integrated within a same processing block. It is further appreciated that the communication between the integrated processing block and the ALU 140 may be a bidirectional communication. Moreover, it is appreciated that in some nonlimiting embodiments, the logic engine 120, the convertor engine 130 and the ALU 140 may be integrated within a same processing block, thereby eliminating the need for data communication between different engine blocks.

It is appreciated that since the input data and output data are changed to avoid FP hardware exception generation or requiring special handling, the amount of required resources, power consumption, and the complexity are reduced.

FIG. 2 shows an illustrative example of a programmable architecture configured to efficiently handle FP hardware exception in FP arithmetic operation according to one aspect of the present embodiments. FIG. 2 is substantially similar to that of FIG. 1. However, it is appreciated that in this embodiment, a rules engine 210 may be used to program the logic engine 120 and/or the convertor engine 130. In other words, the circumstances under which the input data for the FP arithmetic operator or its output data is changed may be user programmable. The rules engine 210 enables additional FP hardware exception or special handling to be changed or added for other situations.
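One way to picture the programmability offered by the rules engine 210 is a lookup table that maps each detected condition to a replacement value and that a user may edit. The sketch below is an assumption about how such rules could be represented in software, not a description of the rules engine itself; classify, DEFAULT_RULES, and apply_rules are hypothetical names, and binary64 is assumed as the system format.

```python
import math
import sys


def classify(x: float) -> str:
    """Return a hypothetical condition label for a value."""
    if math.isnan(x):
        return "nan"
    if x == math.inf:
        return "pos_inf"
    if x == -math.inf:
        return "neg_inf"
    if x != 0.0 and abs(x) < sys.float_info.min:
        return "denormal"
    return "ordinary"


# Default rules mirroring the replacements described for FIG. 1.
DEFAULT_RULES = {
    "nan": 0.0,
    "pos_inf": sys.float_info.max,
    "neg_inf": -sys.float_info.max,
    "denormal": 0.0,
}


def apply_rules(x: float, rules: dict) -> float:
    """Replace x if its condition has a rule; otherwise pass it through."""
    return rules.get(classify(x), x)


# A user-programmed variant that leaves denormal numbers untouched.
custom_rules = dict(DEFAULT_RULES)
del custom_rules["denormal"]
```

With the default rules a denormal operand becomes zero, whereas with custom_rules the same operand is passed through unchanged.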

FIG. 3 shows an illustrative example of an architecture configured to efficiently handle FP hardware exception and tracking thereof according to one aspect of the present embodiments. FIG. 3 is substantially similar to that of FIG. 2. In this embodiment, however, a memory 310 may be used to track when an input data for a FP arithmetic operator or its output is changed. For example, an out-of-bounds flag 128 may be generated and stored in the memory 310 when the input data 112 or output 142 of the ALU 140 is a denormal number, positive infinity, or a negative infinity. In some examples, an un-initialized flag 128 may be generated and stored in the memory 310 when the input data 112 is either a QNAN or an SNAN. The generated flag may be a divide-by-zero flag when the dividend of a division operation is non-zero and divisor is zero.
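The flag tracking can be sketched with a small bit-flag type and a list standing in for the memory 310. The names FpFlag, input_flag, track, and flag_memory are hypothetical, and binary64 is assumed as the system format.

```python
import math
import sys
from enum import Flag, auto


class FpFlag(Flag):
    """Hypothetical encoding of the three flags described for FIG. 3."""
    NONE = 0
    OUT_OF_BOUNDS = auto()
    UNINITIALIZED = auto()
    DIVIDE_BY_ZERO = auto()


def input_flag(x: float) -> FpFlag:
    """Flag raised by a single operand, following the rules in the text."""
    if math.isnan(x):
        return FpFlag.UNINITIALIZED
    if math.isinf(x) or (x != 0.0 and abs(x) < sys.float_info.min):
        return FpFlag.OUT_OF_BOUNDS
    return FpFlag.NONE


flag_memory = []  # Stand-in for the memory 310.


def track(operands) -> FpFlag:
    """Combine the flags of all operands and record any that were raised."""
    flags = FpFlag.NONE
    for x in operands:
        flags |= input_flag(x)
    if flags:
        flag_memory.append(flags)
    return flags
```

A denormal operand paired with a NaN operand raises both the un-initialized and out-of-bounds flags, matching the corresponding rows of the tables that follow.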

For illustrative purposes that should not be construed as limiting the scope of the embodiments, various input data for a FP arithmetic operator, e.g., addition, subtraction, or add-reduce, along with changes thereof to the input, and its output, as described above are shown below.

Inputs | Changed Input | FP Arithmetic Output | Generated Flag
(±denorm, ±denorm) | (±0, ±0) | ±0 | Out-of-bounds
(±denorm, ±inf) | (±0, max/min) | max/min | Out-of-bounds
(±denorm, ±qnan) | (±0, ±0) | ±0 | Un-initialized, Out-of-bounds
(±denorm, ±snan) | (±0, ±0) | ±0 | Un-initialized, Out-of-bounds
(±denorm, ±norm) | (±0, ±norm) | ±norm | Out-of-bounds
(±inf, ±inf) | (max/min, max/min) | max/min | Out-of-bounds
(±inf, ±inf) | (max/min, max/min) | ±0 | Out-of-bounds
(±inf, ±qnan) | (max/min, ±0) | max/min | Un-initialized, Out-of-bounds
(±inf, ±snan) | (max/min, ±0) | max/min | Un-initialized, Out-of-bounds
(±inf, ±norm) | (max/min, ±norm) | max/min | Out-of-bounds
(±inf, ±norm) | (max/min, ±norm) | ±norm | Out-of-bounds
(±qnan, ±qnan) | (±0, ±0) | ±0 | Un-initialized
(±qnan, ±snan) | (±0, ±0) | ±0 | Un-initialized
(±qnan, ±norm) | (±0, ±norm) | ±norm | Un-initialized
(±snan, ±snan) | (±0, ±0) | ±0 | Un-initialized
(±snan, ±norm) | (±0, ±norm) | ±norm | Un-initialized
(±norm, ±norm) | (±norm, ±norm) | ±0 | Out-of-bounds
(±norm, ±norm) | (±norm, ±norm) | max/min |
(±norm, ±norm) | (±norm, ±norm) | ±norm |

For illustrative purposes that should not be construed as limiting the scope of the embodiments, various input data for a FP arithmetic operator, e.g., negation, as described above are shown below.

Inputs | Changed Input | FP Arithmetic Output | Generated Flag
±denorm | ±0 | ±0 | Out-of-bounds
±inf | max/min | max/min | Out-of-bounds
±qnan | ±0 | ±0 | Un-initialized
±snan | ±0 | ±0 | Un-initialized
±norm | ±norm | ±norm |

For illustrative purposes that should not be construed as limiting the scope of the embodiments, various input data for a FP arithmetic operator, e.g., multiplication, as described above are shown below.

Inputs | Changed Input | FP Arithmetic Output | Generated Flag
(±denorm, ±denorm) | (±0, ±0) | ±0 | Out-of-bounds
(±denorm, ±inf) | (±0, max/min) | ±0 | Out-of-bounds
(±denorm, ±qnan) | (±0, ±0) | ±0 | Un-initialized, Out-of-bounds
(±denorm, ±snan) | (±0, ±0) | ±0 | Un-initialized, Out-of-bounds
(±denorm, ±norm) | (±0, ±norm) | ±0 | Out-of-bounds
(±inf, ±inf) | (max/min, max/min) | max/min | Out-of-bounds
(±inf, ±qnan) | (max/min, ±0) | ±0 | Un-initialized, Out-of-bounds
(±inf, ±snan) | (max/min, ±0) | ±0 | Un-initialized, Out-of-bounds
(±inf, ±norm) | (max/min, ±norm) | max/min | Out-of-bounds
(±inf, ±norm) | (max/min, ±norm) | ±norm | Out-of-bounds
(±qnan, ±qnan) | (±0, ±0) | ±0 | Un-initialized
(±qnan, ±snan) | (±0, ±0) | ±0 | Un-initialized
(±qnan, ±norm) | (±0, ±norm) | ±0 | Un-initialized
(±snan, ±snan) | (±0, ±0) | ±0 | Un-initialized
(±snan, ±norm) | (±0, ±norm) | ±0 | Un-initialized
(±norm, ±norm) | (±norm, ±norm) | ±0 | Out-of-bounds
(±norm, ±norm) | (±norm, ±norm) | max/min | Out-of-bounds
(±norm, ±norm) | (±norm, ±norm) | ±norm |

It is appreciated that the multiplication operation may have a 32-bit input/output but the operation may be performed as a 16-bit floating point. In other words, a 32-bit floating point data is converted into a 16-bit floating point number and its value may be clipped to a maximum, minimum or zero. The conversion may cause an out-of-bound exception that is handled according to the embodiments, as described above.
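The clipping that accompanies the 32-bit to 16-bit conversion can be sketched as below. The sketch assumes the 16-bit format is IEEE-754 binary16, whose largest finite value is 65504 and whose smallest normal value is 2 to the power of -14, and it assumes the input has already been cleared of NaN values; fp32_to_fp16_clipped is a hypothetical name. Rounding to binary16 uses the "e" format of Python's struct module.

```python
import math
import struct

# Assumption: the 16-bit format is IEEE-754 binary16.
FP16_MAX = 65504.0
FP16_SMALLEST_NORMAL = 2.0 ** -14


def fp32_to_fp16_clipped(x: float):
    """Return (value representable in binary16, out_of_bounds_flag)."""
    if x > FP16_MAX:
        return FP16_MAX, True  # Clipped to the maximum.
    if x < -FP16_MAX:
        return -FP16_MAX, True  # Clipped to the minimum.
    if x != 0.0 and abs(x) < FP16_SMALLEST_NORMAL:
        return math.copysign(0.0, x), True  # Would be denormal: use zero.
    # In range: round to the nearest binary16 value.
    rounded = struct.unpack("<e", struct.pack("<e", x))[0]
    return rounded, False
```

A value such as 1e6 is clipped to 65504 and flagged, a value such as 1e-6 becomes zero and is flagged, and 1.5 converts without any flag.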

For illustrative purposes that should not be construed as limiting the scope of the embodiments, various input data for a FP arithmetic operator, e.g., maximum, minimum, max-reduce, min-reduce, as described above are shown below.

Inputs | Changed Input | FP Arithmetic Output | Generated Flag
(±denorm, ±denorm) | (±0, ±0) | ±0 | Out-of-bounds
(±denorm, ±inf) | (±0, max/min) | max/min | Out-of-bounds
(±denorm, ±inf) | (±0, max/min) | ±0 | Out-of-bounds
(±denorm, ±qnan) | (±0, ±0) | ±0 | Un-initialized, Out-of-bounds
(±denorm, ±snan) | (±0, ±0) | ±0 | Un-initialized, Out-of-bounds
(±denorm, ±norm) | (±0, ±norm) | ±0 | Out-of-bounds
(±denorm, ±norm) | (±0, ±norm) | ±norm | Out-of-bounds
(±inf, ±inf) | (max/min, max/min) | max/min | Out-of-bounds
(±inf, ±qnan) | (max/min, ±0) | ±0 | Un-initialized, Out-of-bounds
(±inf, ±qnan) | (max/min, ±0) | max/min | Un-initialized, Out-of-bounds
(±inf, ±snan) | (max/min, ±0) | ±0 | Un-initialized, Out-of-bounds
(±inf, ±snan) | (max/min, ±0) | max/min | Un-initialized, Out-of-bounds
(±inf, ±norm) | (max/min, ±norm) | max/min | Out-of-bounds
(±inf, ±norm) | (max/min, ±norm) | ±norm | Out-of-bounds
(±qnan, ±qnan) | (±0, ±0) | ±0 | Un-initialized
(±qnan, ±snan) | (±0, ±0) | ±0 | Un-initialized
(±qnan, ±norm) | (±0, ±norm) | ±0 | Un-initialized
(±qnan, ±norm) | (±0, ±norm) | ±norm | Un-initialized
(±snan, ±snan) | (±0, ±0) | ±0 | Un-initialized
(±snan, ±norm) | (±0, ±norm) | ±0 | Un-initialized
(±snan, ±norm) | (±0, ±norm) | ±norm | Un-initialized
(±norm, ±norm) | (±norm, ±norm) | ±norm |

For illustrative purposes that should not be construed as limiting the scope of the embodiments, various input data for a FP arithmetic operator, e.g., division, as described above are shown below.

Inputs | Changed Input | FP Arithmetic Output | Generated Flag
(±denorm, ±denorm) | (±0, ±0) | ±0 | Out-of-bounds
(±denorm, ±inf) | (±0, max/min) | ±0 | Out-of-bounds
(±denorm, ±qnan) | (±0, ±0) | ±0 | Un-initialized, Out-of-bounds
(±denorm, ±snan) | (±0, ±0) | ±0 | Un-initialized, Out-of-bounds
(±denorm, ±norm) | (±0, ±norm) | ±0 | Out-of-bounds
(±inf, ±inf) | (max/min, max/min) | ±1 | Out-of-bounds
(±inf, ±qnan) | (max/min, ±0) | max/min | Un-initialized, Out-of-bounds, Divide-by-zero
(±inf, ±snan) | (max/min, ±0) | max/min | Un-initialized, Out-of-bounds, Divide-by-zero
(±inf, ±norm) | (max/min, ±norm) | max/min | Out-of-bounds, Divide-by-zero
(±inf, ±norm) | (max/min, ±norm) | ±norm | Out-of-bounds, Divide-by-zero
(±qnan, ±denorm) | (±0, ±0) | ±0 | Un-initialized, Out-of-bounds
(±qnan, ±inf) | (±0, max/min) | ±0 | Un-initialized, Out-of-bounds
(±qnan, ±qnan) | (±0, ±0) | ±0 | Un-initialized
(±qnan, ±snan) | (±0, ±0) | ±0 | Un-initialized
(±qnan, ±norm) | (±0, ±norm) | ±0 | Un-initialized
(±snan, ±denorm) | (±0, ±0) | ±0 | Un-initialized, Out-of-bounds
(±snan, ±inf) | (±0, max/min) | ±0 | Un-initialized, Out-of-bounds
(±snan, ±qnan) | (±0, ±0) | ±0 | Un-initialized
(±snan, ±snan) | (±0, ±0) | ±0 | Un-initialized
(±snan, ±norm) | (±0, ±norm) | ±0 | Un-initialized
(±norm, ±denorm) | (±norm, ±0) | max/min | Out-of-bounds, Divide-by-zero
(±norm, ±inf) | (±norm, max/min) | ±0 | Out-of-bounds
(±norm, ±inf) | (±norm, max/min) | ±norm | Out-of-bounds
(±norm, ±qnan) | (±norm, ±0) | max/min | Un-initialized, Divide-by-zero
(±norm, ±snan) | (±norm, ±0) | max/min | Un-initialized, Divide-by-zero
(±norm, ±norm) | (±norm, ±norm) | ±0 | Out-of-bounds
(±norm, ±norm) | (±norm, ±norm) | max/min | Out-of-bounds
(±norm, ±norm) | (±norm, ±norm) | ±norm |

It is appreciated that the division operation may have a 32-bit input/output, similar to multiplication as described above, but the operation may be performed as a 16-bit floating point. In other words, a 32-bit floating point data is converted into a 16-bit floating point number and its value may be clipped to a maximum, minimum or zero. The conversion may cause an out-of-bound exception or a divide by zero exception that is handled according to the embodiments, as described above.
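A few rows of the division table can be reproduced with the following sketch, which replaces the operands, handles a zero divisor explicitly, and collects the generated flags. It assumes binary64 as the system format and uses plain strings for the flags; sanitize and flagged_divide are hypothetical names.

```python
import math
import sys

# Assumption: binary64 is the system format.
MAX_SUPPORTED = sys.float_info.max


def sanitize(x: float, flags: set) -> float:
    """Replace a problematic operand and record why it was replaced."""
    if math.isnan(x):
        flags.add("un-initialized")
        return math.copysign(0.0, x)
    if math.isinf(x):
        flags.add("out-of-bounds")
        return math.copysign(MAX_SUPPORTED, x)
    if x != 0.0 and abs(x) < sys.float_info.min:
        flags.add("out-of-bounds")
        return math.copysign(0.0, x)
    return x


def flagged_divide(dividend: float, divisor: float):
    """Divide without raising; return (result, set of generated flags)."""
    flags = set()
    a = sanitize(dividend, flags)
    b = sanitize(divisor, flags)
    if b == 0.0:
        if a == 0.0:
            return 0.0, flags  # Zero dividend: no divide-by-zero flag.
        flags.add("divide-by-zero")
        sign = math.copysign(1.0, a) * math.copysign(1.0, b)
        return math.copysign(MAX_SUPPORTED, sign), flags
    q = a / b
    if math.isinf(q):
        flags.add("out-of-bounds")
        q = math.copysign(MAX_SUPPORTED, q)
    elif q != 0.0 and abs(q) < sys.float_info.min:
        flags.add("out-of-bounds")
        q = math.copysign(0.0, q)
    return q, flags
```

Dividing a normal number by a denormal number yields the maximum supported number with both the out-of-bounds and divide-by-zero flags, while dividing infinity by infinity yields 1 with only the out-of-bounds flag, as in the table above.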

For illustrative purposes that should not be construed as limiting the scope of the embodiments, various input data for a FP arithmetic operator, e.g., FPx to FPy (where x>y), as described above are shown below.

Inputs | Changed Input | FP Arithmetic Output | Generated Flag
±denorm | ±0 | ±0 | Out-of-bounds
±inf | max/min (FPx) | max/min (FPy) | Out-of-bounds
±qnan | ±0 | ±0 | Un-initialized
±snan | ±0 | ±0 | Un-initialized
±norm | ±norm | ±0 | Out-of-bounds
±norm | ±norm | max/min (FPy) | Out-of-bounds
±norm | ±norm | ±norm |

For illustrative purposes that should not be construed as limiting the scope of the embodiments, various input data for a FP arithmetic operator, e.g., FPx to FPy (where x<y), as described above are shown below.

Inputs | Changed Input | FP Arithmetic Output | Generated Flag
±denorm | ±0 | ±0 | Out-of-bounds
±inf | max/min (FPx) | max/min (FPx) | Out-of-bounds
±qnan | ±0 | ±0 | Un-initialized
±snan | ±0 | ±0 | Un-initialized
±norm | ±norm | ±norm |

For illustrative purposes that should not be construed as limiting the scope of the embodiments, various input data for a FP arithmetic operator, e.g., FPx to Int, as described above are shown below.

Inputs | Changed Input | FP Arithmetic Output | Generated Flag
±denorm | ±0 | ±0 | Out-of-bounds
±inf | max/min | max/min (int) | Out-of-bounds
±inf | max/min | int | Out-of-bounds
±qnan | ±0 | ±0 | Un-initialized
±snan | ±0 | ±0 | Un-initialized
±norm | ±norm | ±0 | Out-of-bounds
±norm | ±norm | max/min | Out-of-bounds
±norm | ±norm | int |

It is appreciated that the floating point to integer operation may have a 16-bit input and as such it may not need a 32-bit to 16-bit conversion. In other words, the operation may be performed as a 16-bit floating point and it may be converted to integer, e.g., int9 (as described and as incorporated by reference in its entirety in patent application number <TBD>, filed on <TBD>, entitled “System and Method for INT9 Quantization”). The value may be clipped to integer maximum or minimum and it may trigger out-of-bounds exception that is handled according to the embodiments, as described above.
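The clipping in the floating point to integer conversion can be sketched as follows. The int9 range used here, -256 to 255, is an assumption (a signed two's-complement 9-bit integer), since the referenced INT9 application is not reproduced in this document; rounding to the nearest integer via Python's round is likewise an assumption, and fp_to_int9 is a hypothetical name.

```python
import math
import sys

# Assumption: int9 is a signed two's-complement 9-bit integer.
INT9_MAX = 255
INT9_MIN = -256


def fp_to_int9(x: float):
    """Convert to int9 with clipping; return (integer, set of flags)."""
    flags = set()
    if math.isnan(x):
        flags.add("un-initialized")
        return 0, flags
    if math.isinf(x):
        flags.add("out-of-bounds")
        return (INT9_MAX if x > 0 else INT9_MIN), flags
    if x != 0.0 and abs(x) < sys.float_info.min:
        flags.add("out-of-bounds")  # Denormal input is treated as zero.
        return 0, flags
    n = round(x)  # Assumed rounding mode: nearest, ties to even.
    if n > INT9_MAX:
        flags.add("out-of-bounds")
        n = INT9_MAX
    elif n < INT9_MIN:
        flags.add("out-of-bounds")
        n = INT9_MIN
    return n, flags
```

Under these assumptions, 1000.0 is clipped to 255 with an out-of-bounds flag, while 3.7 converts to 4 with no flag.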

FIG. 4 shows an example of a method for efficiently handling FP hardware exception according to one aspect of the present embodiments. At step 410, an input data is received, e.g., a FP number, QNAN, SNAN, denormal number, etc., as described above with respect to FIGS. 1-3. At step 420, it is determined whether the received input data generates a FP hardware exception responsive to a FP arithmetic operation on the input data. It is appreciated that the determination occurs prior to performing the FP arithmetic operation. For example, it is determined whether the input data is a positive infinity, negative infinity, SNAN, etc., as described above with respect to FIGS. 1-3. Furthermore, it is determined whether the input data requires a special handling, e.g., whether the input data is a denormal number, whether the input data is a QNAN, etc., as described above with respect to FIGS. 1-3. In response to determining that the input data generates a FP hardware exception if operated on by a FP arithmetic operator, the input data is changed at step 430. For a non-limiting example, if the input data is positive infinity then the value of the input data is changed to the maximum supported number by the system, if the input data is negative infinity then the value of the input data is changed to the minimum supported number by the system, if the input data is an SNAN then the input data is changed to a zero value, etc., as described above with respect to FIGS. 1-3. Accordingly, the changing of the value of the input data eliminates FP hardware exception generation once the input data is operated on by the FP arithmetic operator. In some embodiments, the value of the input data is also changed if the original input data value requires a special handling. For a non-limiting example, if the input data is a denormal number then the value of the input data is changed to zero, if the input data is QNAN then the value of the input data is changed to zero, etc. At step 440, the input data (i.e. changed value or original value) is operated on by the FP arithmetic operator, as described in FIGS. 1-3. The FP arithmetic operation may be an addition operation, a subtraction operation, an add-reduce operation, multiplication operation, negation operation, maximum operation, minimum operation, max-reduce operation, min-reduce operation, division operation, FPx to FPy (where x>y) operation, FPx to FPy (where x<y) operation, FP to Int operation, etc.

At step 450, it is determined whether the output result of the FP arithmetic operation generates a FP hardware exception, before a FP hardware exception is generated. For example, if the output result of the FP arithmetic operator is a positive infinity, a negative infinity, etc., then it is determined that the output result would generate a FP hardware exception, as described in FIGS. 1-3. At step 460, responsive to determining that the output result would generate a FP hardware exception the value of the output result is changed, e.g., positive infinity is changed to a maximum supported number by the system, negative infinity is changed to a minimum supported number by the system, etc. It is further appreciated that in some embodiments, if the output result is a denormal number, the value of the output result is changed to zero.

It is appreciated that at step 470, a flag is optionally generated when the input data is determined to generate a FP hardware exception (prior to generating a FP hardware exception), or when the input data would require a special handling (i.e. QNAN, denormal number, etc.), or when the output result of the FP arithmetic operator would generate a FP hardware exception (prior to the FP hardware exception being generated) or if the output result would require a special handling (i.e. output result is a denormal number), etc. The generated flag may be an out-of-bounds flag if the value is positive infinity, negative infinity, a denormal number, etc. The generated flag may be an un-initialized flag when the data is a QNAN or an SNAN. The generated flag may be a divide-by-zero flag when the dividend of a division operation is non-zero and divisor is zero.

FIG. 5 shows an example of another method for efficiently handling FP hardware exceptions in a FP arithmetic operation according to one aspect of the present embodiments. At step 510, a first and a second input data are received for a FP arithmetic operation, e.g., addition, subtraction, add-reduce, multiplication, negation, maximum, minimum, max-reduce, min-reduce, division, FPx to FPy (where x>y), FPx to FPy (where x<y), FP to Int, etc., as described in FIGS. 1-3. At step 520, the first/second input data is set to zero if the first/second input data is a denormal number, a QNAN, an SNAN, etc., as described in FIGS. 1-3. At step 530, the first/second input data is set to a maximum supported value if the first/second input data is positive infinity, and is set to a minimum supported value if the first/second input data is negative infinity. It is appreciated that if any of the input data is an SNAN, QNAN, denormal, positive infinity, or negative infinity, then a FP hardware exception would be generated if the unmodified input data were operated on by the FP arithmetic operator. At step 540, the first and the second input data (i.e., changed value or original value) are operated on by the FP arithmetic operator, as described in FIGS. 1-3. At step 550, it is determined whether the output result of the FP arithmetic operation would generate a FP hardware exception, before a FP hardware exception is generated. For a non-limiting example, if the output result of the FP arithmetic operator is a positive infinity, a negative infinity, etc., then it is determined that the output result would generate a FP hardware exception, as described in FIGS. 1-3. At step 560, responsive to determining that the output result would generate a FP hardware exception, the value of the output result is changed, e.g., positive infinity is changed to a maximum supported number by the system, negative infinity is changed to a minimum supported number by the system, etc. It is further appreciated that in some embodiments, if the output result is a denormal number, the value of the output result is changed to zero.
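For a non-limiting illustration, steps 510-560 may be sketched in software as follows. The sketch assumes FP32 as the working format, takes the "maximum supported value" to be the largest finite FP32 number and the "minimum supported value" to be its negation, and uses the hypothetical helper names `sanitize`, `to_fp32`, and `guarded_op`; none of these choices is fixed by the description above, and a hardware embodiment would operate on bit patterns rather than Python floats.

```python
import math
import struct

# Assumption: FP32 limits; the description does not fix a particular format.
FP32_MAX = struct.unpack("<f", struct.pack("<I", 0x7F7FFFFF))[0]         # largest finite FP32
FP32_MIN_NORMAL = struct.unpack("<f", struct.pack("<I", 0x00800000))[0]  # smallest positive normal FP32


def to_fp32(x: float) -> float:
    """Round a Python float to FP32; a finite value that overflows becomes infinity."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def sanitize(x: float) -> float:
    """Replace special values with ordinary ones (steps 520, 530, and 560)."""
    if math.isnan(x):
        return 0.0                                   # QNAN or SNAN -> zero
    if math.isinf(x):
        return FP32_MAX if x > 0 else -FP32_MAX      # +/- infinity -> max/min supported value
    if x != 0.0 and abs(x) < FP32_MIN_NORMAL:
        return 0.0                                   # denormal -> zero
    return x


def guarded_op(op, a: float, b: float) -> float:
    """Apply a two-operand FP operation with inputs and output sanitized."""
    a, b = sanitize(to_fp32(a)), sanitize(to_fp32(b))  # steps 510-530
    result = to_fp32(op(a, b))                         # step 540
    return sanitize(result)                            # steps 550-560
```

Under these assumptions, `guarded_op(lambda x, y: x * y, FP32_MAX, 2.0)` yields the largest finite FP32 value instead of positive infinity, and a NaN operand is treated as zero before the operation is performed.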

It is appreciated that at step 570, a flag is optionally generated when the first or the second input data is determined to generate a FP hardware exception (prior to generating a FP hardware exception), or when the first or the second input data would require special handling (e.g., QNAN, denormal number, etc.), or when the output result of the FP arithmetic operator would generate a FP hardware exception (prior to the FP hardware exception being generated), or when the output result would require special handling (e.g., the output result is a denormal number), etc. The generated flag may be an out-of-bounds flag if the value is positive infinity, negative infinity, a denormal number, etc. The generated flag may be an un-initialized flag when the input data is a QNAN or an SNAN. The generated flag may be a divide-by-zero flag when the dividend of a division operation is non-zero and the divisor is zero.
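For a non-limiting illustration, the flag selection described above may be sketched as follows. The names `FpFlag`, `value_flags`, and `division_flags` are hypothetical, FP32 is assumed for the denormal threshold, and treating a NaN dividend as not "non-zero" for the divide-by-zero flag is an assumption of this sketch rather than something the description states.

```python
import math
from enum import Flag, auto


class FpFlag(Flag):
    NONE = 0
    OUT_OF_BOUNDS = auto()    # positive infinity, negative infinity, denormal number
    UNINITIALIZED = auto()    # QNAN or SNAN
    DIVIDE_BY_ZERO = auto()   # non-zero dividend, zero divisor


FP32_MIN_NORMAL = 2.0 ** -126  # assumption: FP32 smallest positive normal number


def value_flags(x: float) -> FpFlag:
    """Flag for a single input value or output result."""
    if math.isnan(x):
        return FpFlag.UNINITIALIZED
    if math.isinf(x) or (x != 0.0 and abs(x) < FP32_MIN_NORMAL):
        return FpFlag.OUT_OF_BOUNDS
    return FpFlag.NONE


def division_flags(dividend: float, divisor: float) -> FpFlag:
    """Flags for the operands of a division, checked before the division is performed."""
    flags = value_flags(dividend) | value_flags(divisor)
    # Assumption: a NaN dividend does not count as "non-zero" here.
    if divisor == 0.0 and dividend != 0.0 and not math.isnan(dividend):
        flags |= FpFlag.DIVIDE_BY_ZERO
    return flags
```

Because `FpFlag` is a bit-flag enumeration, several conditions can be reported at once, e.g., an infinite dividend with a zero divisor sets both the out-of-bounds and divide-by-zero flags in this sketch.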

Referring now to FIG. 6, a block diagram depicting an example of a computer system suitable for efficient handling of FP hardware exceptions in accordance with some embodiments is shown. In some examples, computer system 1100 can be used to implement computer programs, applications, methods, processes, or other software to perform the above-described techniques and to realize the structures described herein. Computer system 1100 includes a bus 1102 or other communication mechanism for communicating information, which interconnects subsystems and devices, such as a processor 1104, a system memory (“memory”) 1106, a storage device 1108 (e.g., ROM), a disk drive 1110 (e.g., magnetic or optical), a communication interface 1112 (e.g., modem or Ethernet card), a display 1114 (e.g., CRT or LCD), an input device 1116 (e.g., keyboard), and a pointer cursor control 1118 (e.g., mouse or trackball). In one embodiment, pointer cursor control 1118 invokes one or more commands that, at least in part, modify the rules stored, for example in memory 1106, to define the electronic message preview process.

According to some examples, computer system 1100 performs specific operations in which processor 1104 executes one or more sequences of one or more instructions stored in system memory 1106. Such instructions can be read into system memory 1106 from another computer readable medium, such as storage device 1108 or disk drive 1110. In some examples, hard-wired circuitry can be used in place of or in combination with software instructions for implementation. In the example shown, system memory 1106 includes modules of executable instructions for implementing an operating system (“O/S”) 1132 and an application 1136 (e.g., a host, server, web services-based, distributed (i.e., enterprise) application programming interface (“API”), program, procedure or others). Further, application 1136 includes a logic engine 1138 that determines whether the input data would generate a FP hardware exception if operated on by the FP arithmetic operator or whether the input data requires special handling (e.g., denormal number, QNAN, etc.), as described above in FIGS. 1-5. The application 1136 further includes a convertor engine 1141 that changes the value of the input data or output result of the FP arithmetic operator if the logic engine 1138 determines that not changing the value would generate a FP hardware exception or would require special handling, as described in FIGS. 1-5.
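For a non-limiting illustration, the division of work between logic engine 1138 and convertor engine 1141 may be sketched at the bit level as follows. The sketch assumes 32-bit IEEE-754 bit patterns and the common convention that a set most-significant mantissa bit marks a quiet NaN; the names `Kind`, `classify`, and `convert` are hypothetical, and converting to positive zero regardless of the input sign is likewise an assumption.

```python
from enum import Enum, auto


class Kind(Enum):
    NORMAL = auto()
    ZERO = auto()
    DENORMAL = auto()
    POS_INF = auto()
    NEG_INF = auto()
    QNAN = auto()
    SNAN = auto()


def classify(bits: int) -> Kind:
    """Logic-engine role: classify an FP32 bit pattern without operating on it."""
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent == 0xFF:
        if mantissa == 0:
            return Kind.NEG_INF if sign else Kind.POS_INF
        # Assumption: the top mantissa bit set means quiet, clear means signaling.
        return Kind.QNAN if mantissa & 0x400000 else Kind.SNAN
    if exponent == 0:
        return Kind.ZERO if mantissa == 0 else Kind.DENORMAL
    return Kind.NORMAL


def convert(bits: int) -> int:
    """Convertor-engine role: replace values that would need special handling."""
    kind = classify(bits)
    if kind in (Kind.QNAN, Kind.SNAN, Kind.DENORMAL):
        return 0x00000000          # zero
    if kind is Kind.POS_INF:
        return 0x7F7FFFFF          # largest finite FP32 value
    if kind is Kind.NEG_INF:
        return 0xFF7FFFFF          # most negative finite FP32 value
    return bits                    # ordinary values pass through unchanged
```

Keeping classification separate from conversion mirrors the two engines described above and lets the same classification result also drive optional flag generation.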

The term “computer readable medium” refers, at least in one embodiment, to any medium that participates in providing instructions to processor 1104 for execution. Such a medium can take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as disk drive 1110. Volatile media includes dynamic memory, such as system memory 1106. Transmission media includes coaxial cables, copper wire, and fiber optics, including wires that comprise bus 1102. Transmission media can also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.

Common forms of computer readable media include, for example, floppy disk, flexible disk, hard disk, magnetic tape, any other magnetic medium, CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, RAM, PROM, EPROM, FLASH-EPROM, any other memory chip or cartridge, electromagnetic waveforms, or any other medium from which a computer can read.

In some examples, execution of the sequences of instructions can be performed by a single computer system 1100. According to some examples, two or more computer systems 1100 coupled by communication link 1120 (e.g., LAN, PSTN, or wireless network) can perform the sequence of instructions in coordination with one another. Computer system 1100 can transmit and receive messages, data, and instructions, including program code (i.e., application code) through communication link 1120 and communication interface 1112. Received program code can be executed by processor 1104 as it is received, and/or stored in disk drive 1110, or other non-volatile storage for later execution. In one embodiment, system 1100 is implemented as a hand-held device. But in other embodiments, system 1100 can be implemented as a personal computer (i.e., a desktop computer) or any other computing device. In at least one embodiment, any of the above-described delivery systems can be implemented as a single system 1100 or can be implemented in a distributed architecture including multiple systems 1100.

In other examples, the systems, as described above, can be implemented from a personal computer, a computing device, a mobile device, a mobile telephone, a facsimile device, a personal digital assistant (“PDA”) or other electronic device.

In at least some of the embodiments, the structures and/or functions of any of the above-described interfaces and panels can be implemented in software, hardware, firmware, circuitry, or a combination thereof. Note that the structures and constituent elements shown throughout, as well as their functionality, can be aggregated with one or more other structures or elements.

Alternatively, the elements and their functionality can be subdivided into constituent sub-elements, if any. As software, the above-described techniques can be implemented using various types of programming or formatting languages, frameworks, syntax, applications, protocols, objects, or techniques, including C, Objective C, C++, C#, Flex™, Fireworks®, Java™, Javascript™, AJAX, COBOL, Fortran, ADA, XML, HTML, DHTML, XHTML, HTTP, XMPP, and others. These can be varied and are not limited to the examples or descriptions provided.

The foregoing description of various embodiments of the claimed subject matter has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the claimed subject matter to the precise forms disclosed. Many modifications and variations will be apparent to the practitioner skilled in the art. Embodiments were chosen and described in order to best describe the principles of the invention and its practical application, thereby enabling others skilled in the relevant art to understand the claimed subject matter, the various embodiments and the various modifications that are suited to the particular use contemplated.

What is claimed is:
 1. A computer-implemented method comprising: receiving a first input data and a second input data at a floating point arithmetic operating unit, wherein the first input data and the second input data are associated with operands of a floating point arithmetic operation respectively, wherein the floating point operating unit is configured to perform the floating point arithmetic operation on the first input data and the second input data; determining whether the first input data is a quiet not-a-number (qnan) or whether the first input data is a signaling not-a-number (snan) prior to performing the floating point arithmetic operation; and converting a value of the first input data to a modified value prior to performing the floating point arithmetic operation if the first input data is either the qnan or the snan, wherein the converting eliminates special handling on the input data by the floating point arithmetic operating unit when the first input data is either the qnan or the snan.
 2. The method of claim 1 further comprising performing the floating point arithmetic operation on the second input data and the first input data that has been modified to generate an output result.
 3. The method of claim 1, wherein the floating point arithmetic operation is selected from one of addition operation, a subtraction operation, an add-reduce operation, a maximum operation, a minimum operation, a max-reduce operation, a min-reduce operation, a multiplication operation, or a division operation.
 4. The method of claim 2 further comprising determining whether the output result of the floating point arithmetic operation generates a floating point hardware exception.
 5. The method of claim 4 further comprising setting a value of the output result to zero if a value of the output result is a denormal number.
 6. The method of claim 4 further comprising setting a value of the output result to a maximum supported number if a value of the output result is a positive infinity, and setting a value of the output result to a minimum supported number if a value of the output result is a negative infinity.
 7. The method of claim 1, wherein the converting of the value of the first input data to the modified value is setting the value of the first input data to zero if the first input data is the qnan.
 8. The method of claim 7 further comprising generating an un-initialized flag associated with the first input data being the qnan.
 9. The method of claim 1, wherein the converting of the value of the first input data to the modified value is setting the value of the first input data to zero if the first input data is the snan.
 10. The method of claim 9 further comprising generating an un-initialized flag associated with the first input data being the snan.
 11. A computer-implemented method comprising: receiving a first input data and a second input data at a floating point arithmetic operating unit, wherein the first input data and the second input data are associated with operands of a floating point arithmetic operation respectively, wherein the floating point operating unit is configured to perform a floating point arithmetic operation on the first input data and the second input data; determining whether the first input data is a quiet not-a-number (qnan) or whether the first input data is a signaling not-a-number (snan) prior to performing the floating point arithmetic operation; and setting the first input data to zero if the first input data is the qnan or the snan, wherein the setting occurs prior to performing the floating point arithmetic operation.
 12. The method of claim 11 further comprising performing the floating point arithmetic operation on the second input data and the first input data that has been modified to generate an output result.
 13. The method of claim 11, wherein the floating point arithmetic operation is selected from one of addition operation, a subtraction operation, an add-reduce operation, a maximum operation, a minimum operation, a max-reduce operation, a min-reduce operation, a multiplication operation, or a division operation.
 14. The method of claim 12 further comprising determining whether the output result of the floating point arithmetic operation generates a floating point hardware exception.
 15. The method of claim 14 further comprising setting a value of the output result to zero if a value of the output result is a denormal number.
 16. The method of claim 14 further comprising setting a value of the output result to a maximum supported number if a value of the output result is a positive infinity, and setting a value of the output result to a minimum supported number if a value of the output result is a negative infinity.
 17. The method of claim 11 further comprising generating an un-initialized flag associated with the first input data being the qnan or snan.
 18. A system comprising: a logic engine configured to receive a first input data and a second input data at a floating point arithmetic operating unit, wherein the first input data and the second input data are associated with operands of a floating point arithmetic operation respectively, wherein the floating point operating unit is configured to perform a floating point arithmetic operation on the first input data and the second input data, determine whether the first input data is a quiet not-a-number (qnan) or whether the first input data is a signaling not-a-number (snan) prior to performing the floating point arithmetic operation; and a convertor engine configured to convert a value of the first input data to a modified value prior to performing the floating point arithmetic operation if the first input data is either the qnan or the snan, wherein the converting eliminates special handling by the floating point arithmetic operating unit on the first input data being either the qnan or the snan.
 19. The system of claim 18 further comprising said arithmetic floating point operating unit configured to perform the floating point arithmetic operation on the first input data that has been modified and on the second input data to generate an output result.
 20. The system of claim 18, wherein the floating point arithmetic operation is selected from one of addition operation, a subtraction operation, an add-reduce operation, a maximum operation, a minimum operation, a max-reduce operation, a min-reduce operation, a multiplication operation, or a division operation.
 21. The system of claim 19, wherein the logic engine is configured to determine whether the output result of the floating point arithmetic operation generates a floating point hardware exception.
 22. The system of claim 21, wherein the logic engine is configured to set a value of the output result to zero if a value of the output result is a denormal number.
 23. The system of claim 21, wherein the logic engine is configured to set a value of the output result to a maximum supported number if a value of the output result is a positive infinity, and setting a value of the output result to a minimum supported number if a value of the output result is a negative infinity.
 24. The system of claim 18, wherein the convertor engine is configured to set a value of the first input data to zero if the first input data is the qnan.
 25. The system of claim 24, wherein the logic engine is configured to generate an un-initialized flag associated with the first input data being qnan.
 26. The system of claim 18, wherein the convertor engine is configured to set a value of the first input data to zero if the first input data is the snan.
 27. The system of claim 26, wherein the logic engine is configured to generate an un-initialized flag associated with the input data being snan.
 28. A system comprising: a means for receiving a first input data and a second input data at a floating point arithmetic operating unit, wherein the first input data and the second input data are associated with operands of a floating point arithmetic operation respectively, wherein the floating point operating unit is configured to perform a floating point arithmetic operation on the first input data and the second input data; a means for determining whether the first input data is a quiet not-a-number (qnan) or whether the first input data is a signaling not-a-number (snan) prior to performing the floating point arithmetic operation; and a means for setting the first input data to zero if the first input data is the qnan or the snan, wherein the setting occurs prior to performing the floating point arithmetic operation.
 29. The system of claim 28 further comprising a means for performing the floating point arithmetic operation on the second input data and the first input data that has been modified to generate an output result.
 30. The system of claim 29 further comprising a means for determining whether the output result of the floating point arithmetic operation generates a floating point hardware exception.
 31. The system of claim 30 further comprising a means for setting a value of the output result to zero if a value of the output result is a denormal number.
 32. The system of claim 30 further comprising a means for setting a value of the output result to a maximum supported number if a value of the output result is a positive infinity, and setting a value of the output result to a minimum supported number if a value of the output result is a negative infinity.
 33. The system of claim 28 further comprising a means for generating an un-initialized flag associated with the first input data being the qnan or snan.